Understanding the Open Directory Project through Formal Concept Analysis

Author

  • Avin Mathew
Abstract

Directories may contain certain abnormalities in their structure: some categories may contain unrelated sub-categories, whilst others may lack sub-categories that are integral to them. By understanding these irregularities, the effectiveness and efficiency of information management can be greatly improved. The Open Directory Project (ODP) is one such directory, and it aims to categorise the Internet. The ODP is an ambitious project spanning over 500,000 categories, ranging from archery to zebras. However, the uniqueness of the directory arises from its editorial structure. With over 57,000 editors, each given authority to moderate a section of the directory, the ODP is one of the largest publicly edited directories. With such a large editorial base, do irregularities or regularities arise within the directory structure? This project explores the use of Formal Concept Analysis (FCA) in understanding the structure of the ODP. Although the underlying driver of FCA is the ability to generate concept lattices from data, research in FCA has provided a variety of analysis tools. Initial analysis of generated concept lattices indicated that the ODP may be amenable to a horizontal decomposition. However, the sheer size of the directory made computing the entire concept lattice intractable. Thus, an alternative method was developed to perform a horizontal decomposition by computing only sections of the lattice. The techniques developed in this analysis can be applied to other domains, such as file system and software package structures.

Similar resources

Analysis of interactions among barriers in project risk management

In the context of the scope, time, cost, and quality constraints, failure is not uncommon in project management. While small projects have 70% chances of success, large projects virtually have no chance of meeting the quadruple constraints. While there is no dearth of research on project risk management, the manifestation of barriers to project risk management is a less dwelt topic. The success...

Modelling and Evaluation of Co-Evolution in Collective Web Memories

The constantly evolving Web reflects the evolution of society in the cyberspace. Projects like the Open Directory Project (dmoz.org) can be understood as a collective memory of society on the Web. The main assumption is that such collective Web memories evolve when a certain cognition level about a concept has been exceeded. In the scope of our work we analyse the New York Times archive for con...

The ToscanaJ Suite for Implementing Conceptual Information Systems

For over a decade, work on Formal Concept Analysis has been accompanied by the development of the Toscana software. Toscana was implemented to realize the idea of Conceptual Information Systems which allow the analysis of data using concept-oriented methods. Over the years, many ideas from Formal Concept Analysis have been tested in Toscana systems while the real-world problems encountered led ...

Design and Implementation of a Web directory for Medical Education (WDME): a Tool to Facilitate Research in Medical Education

Introduction: Access to medical education resources on the web is one of the current challenges for researchers and medical science educators. The purpose of the current project was to design and implement a comprehensive and specific subject/web directory of medical education. Methods: First, the categories to be incorporated in the directory were defined through reviewing related directories an...

Symbolic links in the Open Directory Project

We present a study to develop an improved understanding of symbolic links in web directories. A symbolic link is a hyperlink which makes a directed connection from a webpage along one path through a directory to a page along another path. While symbolic links are ubiquitous in web directories such as Yahoo!, they are under-studied and, as a result, their uses are poorly understood. A cursory an...

Publication date: 2003